<!DOCTYPE html>

<html>
<head>
<meta charset="UTF-8">
<link href="style.css" type="text/css" rel="stylesheet">
<title>VRNDSCALESS—Round Scalar Float32 Value To Include A Given Number Of Fraction Bits </title></head>
<body>
<h1>VRNDSCALESS—Round Scalar Float32 Value To Include A Given Number Of Fraction Bits</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>
<p>EVEX.NDS.LIG.66.0F3A.W0 0A /r ib</p>
<p>VRNDSCALESS xmm1 {k1}{z}, xmm2, xmm3/m32{sae}, imm8</p></td>
<td>T1S</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Rounds the scalar single-precision floating-point value in xmm3/m32 to the number of fraction bits specified by the imm8 field. Stores the result in the xmm1 register under writemask.</td></tr></table>
<h3>Instruction Operand Encoding</h3>
<table>
<tr>
<td>Op/En</td>
<td>Operand 1</td>
<td>Operand 2</td>
<td>Operand 3</td>
<td>Operand 4</td></tr>
<tr>
<td>T1S</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>NA</td></tr></table>
<p><strong>Description</strong></p>
<p>Rounds the single-precision floating-point value in the low doubleword element of the second source operand (the third operand) using the rounding mode specified in the immediate operand (see Figure 5-29) and places the result in the corresponding element of the destination operand (the first operand) according to the writemask. The doubleword elements at bits 127:32 of the destination are copied from the first source operand (the second operand).</p>
<p>The destination and first source operands are XMM registers; the second source operand can be an XMM register or a memory location. Bits MAX_VL-1:128 of the destination register are cleared.</p>
<p>The rounding process rounds the input to an integral value plus the number of fraction bits specified by imm8[7:4] (which are retained in the result), and returns the result as a single-precision floating-point value.</p>
<p>Note that no overflow occurs while executing this instruction, even though the source is scaled by the imm8[7:4] value.</p>
<p>The immediate operand also specifies control fields for the rounding operation. Three bit fields are defined and shown in the “Immediate Control Description” figure below. Bit 3 of the immediate byte controls the processor behavior for a precision exception, and bit 2 selects the source of rounding-mode control. Bits 1:0 specify a non-sticky rounding-mode value (the immediate control table below lists the encoded values for the rounding-mode field).</p>
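<p>As an illustration of the immediate layout described above, the following C sketch splits imm8 into its four fields and packs them back. The type <code>rndscale_ctrl</code> and the functions <code>rndscale_decode</code> and <code>rndscale_encode</code> are hypothetical names introduced here for illustration; they are not part of any Intel header.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type (not an Intel API): the fields of the
   VRNDSCALESS immediate as described in the text. */
typedef struct {
    unsigned m;          /* imm8[7:4]: number of fraction bits to keep      */
    unsigned spe;        /* imm8[3]:   1 = suppress the precision exception */
    unsigned use_mxcsr;  /* imm8[2]:   1 = rounding mode comes from MXCSR.RC */
    unsigned rc;         /* imm8[1:0]: 00 nearest-even, 01 down, 10 up, 11 truncate */
} rndscale_ctrl;

/* Split an immediate byte into its control fields. */
static rndscale_ctrl rndscale_decode(uint8_t imm8)
{
    rndscale_ctrl c;
    c.m         = (imm8 >> 4) & 0xFu;
    c.spe       = (imm8 >> 3) & 1u;
    c.use_mxcsr = (imm8 >> 2) & 1u;
    c.rc        = imm8 & 3u;
    return c;
}

/* Pack control fields back into an immediate byte. */
static uint8_t rndscale_encode(rndscale_ctrl c)
{
    /* Each field must fit in its bit range. */
    assert(c.m <= 15u && c.spe <= 1u && c.use_mxcsr <= 1u && c.rc <= 3u);
    return (uint8_t)((c.m << 4) | (c.spe << 3) | (c.use_mxcsr << 2) | c.rc);
}
```

<p>For example, under this layout the immediate 0x2B decodes to M = 2, precision exception suppressed, rounding mode taken from the immediate, and truncation as the rounding mode.</p>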
<p>The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN, it is converted to a QNaN. If DAZ is set to 1, denormals are converted to zero before rounding.</p>
<p>The sign of the result of this instruction is preserved, including the sign of zero.</p>
<p>The formula of the operation for VRNDSCALESS is</p>
<p>ROUND(x) = 2<sup>-M</sup>*Round_to_INT(x*2<sup>M</sup>, round_ctrl),</p>
<p>round_ctrl = imm[3:0];</p>
<p>M=imm[7:4];</p>
<p>The operation of x*2<sup>M</sup> is computed as if the exponent range is unlimited (i.e. no overflow ever occurs).</p>
<p>VRNDSCALESS is a more general form of the VEX-encoded VROUNDSS instruction. In VROUNDSS, the formula of the operation on each element is</p>
<p>ROUND(x) = Round_to_INT(x, round_ctrl),</p>
<p>round_ctrl = imm[3:0];</p>
<p>EVEX encoded version: The source operand is an XMM register or a 32-bit memory location. The destination operand is an XMM register.</p>
<p>The handling of special-case input values is listed in Table 5-22.</p>
<p><strong>Operation</strong></p>
<p>RoundToIntegerSP(SRC[31:0], imm8[7:0]) {</p>
<p>if (imm8[2] = 1)</p>
<p>rounding_direction ← MXCSR:RC ; get round control from MXCSR</p>
<p>else</p>
<p>rounding_direction ← imm8[1:0] ; get round control from imm8[1:0]</p>
<p>FI</p>
<p>M ← imm8[7:4] ; get the scaling factor</p>
<p>case (rounding_direction)</p>
<p>00: TMP[31:0] ← round_to_nearest_even_integer(2<sup>M</sup>*SRC[31:0])</p>
<p>01: TMP[31:0] ← round_to_equal_or_smaller_integer(2<sup>M</sup>*SRC[31:0])</p>
<p>10: TMP[31:0] ← round_to_equal_or_larger_integer(2<sup>M</sup>*SRC[31:0])</p>
<p>11: TMP[31:0] ← round_to_nearest_smallest_magnitude_integer(2<sup>M</sup>*SRC[31:0])</p>
<p>ESAC;</p>
<p>Dest[31:0] ← 2<sup>-M</sup>* TMP[31:0] ; scale back down by 2<sup>-M</sup></p>
<p>if (imm8[3] = 0) Then ; check SPE</p>
<p>if (SRC[31:0] != Dest[31:0]) Then ; check precision lost</p>
<p>set_precision() ; set #PE</p>
<p>FI;</p>
<p>FI;</p>
<p>return(Dest[31:0])</p>
<p>}</p>
<p>VRNDSCALESS (EVEX encoded version)</p>
<p>IF k1[0] or *no writemask*</p>
<p>THEN</p>
<p>DEST[31:0] ← RoundToIntegerSP(SRC2[31:0], Zero_upper_imm[7:0])</p>
<p>ELSE</p>
<p>IF *merging-masking* ; merging-masking</p>
<p>THEN *DEST[31:0] remains unchanged*</p>
<p>ELSE DEST[31:0] ← 0 ; zeroing-masking</p>
<p>FI;</p>
<p>FI;</p>
<p>DEST[127:32] ← SRC1[127:32]</p>
<p>DEST[MAX_VL-1:128] ← 0</p>
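<p>The write-back rules above (writemask bit 0, merging versus zeroing, and copying the upper elements from the first source) can be sketched in C as follows. The function <code>rndscale_ss_writeback</code> is a hypothetical name; the 128-bit register is modelled as an array of four floats, the already rounded low element is passed in as <code>rounded</code>, and the clearing of bits MAX_VL-1:128 is outside the scope of this model.</p>

```c
#include <assert.h>

/* Hypothetical model of the VRNDSCALESS write-back step (not an Intel API).
   dest, src1 : four floats each, modelling bits 127:0 of an XMM register
   rounded    : the rounded low element of the second source
   k1         : writemask value; only bit 0 is used
   zeroing    : nonzero selects zeroing-masking, zero selects merging-masking */
static void rndscale_ss_writeback(float dest[4], const float src1[4],
                                  float rounded, unsigned k1, int zeroing)
{
    assert(dest != 0 && src1 != 0);

    if (k1 & 1u)
        dest[0] = rounded;       /* mask bit set: write the result     */
    else if (zeroing)
        dest[0] = 0.0f;          /* zeroing-masking                    */
    /* else merging-masking: dest[0] keeps its previous value */

    /* DEST[127:32] is always copied from SRC1[127:32]. */
    dest[1] = src1[1];
    dest[2] = src1[2];
    dest[3] = src1[3];
}
```

<p>In this sketch a call with an all-ones mask corresponds to the unmasked form of the instruction.</p>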
<p><strong>Intel C/C++ Compiler Intrinsic Equivalent</strong></p>
<p>VRNDSCALESS __m128 _mm_roundscale_ss (__m128 a, __m128 b, int imm);</p>
<p>VRNDSCALESS __m128 _mm_roundscale_round_ss (__m128 a, __m128 b, int imm, int sae);</p>
<p>VRNDSCALESS __m128 _mm_mask_roundscale_ss (__m128 s, __mmask8 k, __m128 a, __m128 b, int imm);</p>
<p>VRNDSCALESS __m128 _mm_mask_roundscale_round_ss (__m128 s, __mmask8 k, __m128 a, __m128 b, int imm, int sae);</p>
<p>VRNDSCALESS __m128 _mm_maskz_roundscale_ss (__mmask8 k, __m128 a, __m128 b, int imm);</p>
<p>VRNDSCALESS __m128 _mm_maskz_roundscale_round_ss (__mmask8 k, __m128 a, __m128 b, int imm, int sae);</p>
<p><strong>SIMD Floating-Point Exceptions</strong></p>
<p>Invalid, Precision</p>
<p>If SPE is enabled, the precision exception is not reported (regardless of the MXCSR exception mask).</p>
<p><strong>Other Exceptions</strong></p>
<p>See Exceptions Type E3.</p></body></html>